A proteomic perspective and involvement of cytokines in SARS-CoV-2 infection

Infection with the SARS-CoV-2 virus results in a range of clinical manifestations, from asymptomatic infection to multi-organ failure. Biochemically, the serious effects are attributed to what is described as a cytokine storm. The initial site of infection for COVID-19 is the nasopharyngeal/oropharyngeal region, which is also the site from which samples are taken to test for the presence of the virus. We have carried out a detailed high throughput quantitative proteomic analysis of nasopharyngeal/oropharyngeal swab samples collected in India, during the early days of the pandemic in 2020, from normal individuals and from those who tested positive for SARS-CoV-2 by RTPCR. Several proteins, such as annexins, cytokines and histones, were found to be differentially regulated in the host human cells following SARS-CoV-2 infection. Genes for these proteins were also observed to be differentially regulated when their expression was analyzed. The majority of the cytokine proteins were found to be up regulated in the infected individuals. Cell to cell signaling and interaction, immune cell trafficking and inflammatory response pathways were found to be associated with the differentially regulated proteins based on network pathway analysis.


Introduction
The COVID19 pandemic has led to extensive investigations on multiple aspects of the biology of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus as well as host responses [1,2]. Despite these investigations, as of now, vaccination appears to be the only way to avoid or reduce infection by the virus [1,2]. To date, testing for the virus relies on swabs from the nasopharyngeal region followed by detection of the virus by real-time polymerase chain reaction (RTPCR) [1][2][3]. The nasopharynx is the initial area of infection and is close to the lungs, which are affected in severe cases of SARS-CoV-2 infection. Hence, it is important to investigate early events at the initial site of infection immediately after detection by RTPCR. The early events related to host response at the site of infection may give a clue about disease progression and protein regulation at an early stage of infection.
Detailed proteomics studies on the serum of SARS-CoV-2 patients have been conducted using liquid chromatography with tandem mass spectrometry (LC-MS/MS) [4][5][6][7][8][9][10]. The proteins from serum were digested with trypsin and the peptides were subjected to LC-MS analysis [4,6,8]. In two studies [5,10], peptides obtained from digesting serum or cellular proteins were pre fractionated by LC, the fractions were pooled, and the pooled peptides were subjected to LC-MS/MS analysis. D'Alessandro et al have examined the effect of IL-6 levels on coagulation and complement status using nano Ultra-High-Pressure Liquid Chromatography-Tandem Mass Spectrometry [4]. The study by Villar et al describes dysregulation of proteins on COVID-19 infection and suggests that immune related markers would aid in monitoring the pandemic [7]. Park et al describe changes in plasma proteins and differences between severe and non-severe COVID-19 patients [8]. Shen et al have reported detailed proteomic and metabolic analysis of serum from patients with COVID19 using Ultra performance liquid chromatography/Tandem Mass Spectrometry (UPLC-MS/MS). Their study indicated massive metabolic suppression and molecular changes in blood induced by SARS-CoV-2 [5]. Messner et al have developed a high throughput platform for serum and plasma proteomic analysis intended for clinical use. Their study revealed differentially expressed proteins that may have potential as biomarkers. The proteins included complement factors and inflammation modulators [6]. However, only one study has been conducted to explain the immune response in the nasopharyngeal (NP) region [10]. Based on proteomic analysis of samples from the NP region [10], Vanderboom et al.
explained that upon SARS-CoV-2 infection, the innate immune system activated interferon (IFN)-mediated proteins in the NP region, which include interferon-stimulated genes, pathogen-recognition receptors, transcription factors, and proinflammatory cytokines [10]. Interestingly, only one common up-regulated protein was found between NP and serum samples. There was no down-regulated protein in common [10]. The functions of nasal-associated lymphoid tissue (NALT) and its immune cells, such as B cells, T cells, macrophages, and dendritic cells, in innate and adaptive immunity following SARS-CoV-2 infection have not been well described. The mucosal immune system is the largest component of the immune system; it mediates both innate and adaptive immunity against SARS-CoV-2 infection by circulating antibodies such as secretory IgA (SIgA), IgA and IgG via NALT to neutralize viral infection [11]. Hence, it is relevant to examine the effect of virus attack on the cells of this region. We have shown that upon infection, there is down regulation of defensin genes [12]. Defensins are crucial components of innate immunity and are the first line of defense against invading pathogens [12,13]. In the present study, we have analyzed samples obtained during the early days of the pandemic in 2020 for protein expression profiles, particularly their regulation with respect to controls. In parallel, we have also examined differential expression of different cytokines. We have observed up regulation of several proteins related to inflammation and innate immunity. Of particular interest is the up regulation of annexins. Auto antibodies to annexins have been observed in COVID19 patients [14]. The up regulated proteins may result in the production of auto antibodies, which may contribute to the metabolic dysfunction observed in the disease. We have observed up regulation of several cytokines at the gene and protein level. These cytokines play a role in inflammation.
Our results indicate that metabolic stress due to viral infection is observed in nasopharyngeal cells at the onset of infection.

Results
The nasopharyngeal region comprises a heterogeneous population of cells that includes epithelial and immune cells [11,15]. The nasopharyngeal region is also the gateway to the lungs, which are badly affected when the illness is serious. The detection of SARS-CoV-2 infection involves RTPCR on nasopharyngeal/oropharyngeal swab (NOPS) samples. We have earlier shown that analysis of cells from this region indicates down regulation of defensin gene expression [12]. In this paper, we describe differential proteomics analysis and expression of some cytokine genes in NP swabs from a control population and from those infected with SARS-CoV-2, as indicated by RTPCR.

Proteomics analysis
Six major viral proteins important for entry into cells and replication, namely ORF1ab, N, nsp9, S, nsp3 and H [1], were found to be expressed in the infected NOPS following high throughput proteomic analysis (Table 1). A total of 37 host human proteins were found to be differentially regulated on infection with SARS-CoV-2 (Table 2) (Fig 1). The differentially regulated proteins include 21 up regulated proteins and one down regulated protein with more than one log fold differential regulation (Table 2) (Fig 1). Proteins such as HIST1H3A, H2AFZ, HIST1H4A, S100A9, GSTA1, DPYSL5 and S100A8 were found to be up regulated by 4 log fold change. ANXA2, BPIFA1, ANXA1, HIST1H2BK and AKR1C4 were up regulated by 3 log fold change. Proteins that were up regulated 1-3 fold were: LTF, ATP5F1A, YWHAQ, ANXA5, LYZ, PRDX1, DMBT1, LGALS3 and CAPS (Table 2) (Fig 1). Many of the up-regulated proteins are involved in host-defense functions and inflammatory reactions. The data generated from the proteomic analysis were submitted to the PRIDE database under project accession number PXD032150 (DOI: 10.6019/PXD032150).

Gene expression analysis
Genes encoding proteins involved in immune responses [16], and those encoding proteins observed to be up regulated in the proteomics analysis, were analyzed for their expression during SARS-CoV-2 infection by RTPCR using the respective gene specific primers (Table 3). The RTPCR analysis showed that the majority of the selected genes were significantly up regulated on viral infection. IL1b, IL11, IL6, ICAM1, VCAM1, HIST1H2BK, HIST1H4A, H2AFZ, HIST1H3A, SERPINB3, DMBT1, ANXA1 and ANXA2 were found to be up regulated by one log fold change (Fig 2). LYZ was the only gene found to be down regulated upon SARS-CoV-2 infection (Fig 2). Apart from LYZ, the proteins which were found up regulated and examined at the gene level were also up regulated in the RTPCR analysis.

Network pathway analysis
Network pathway analysis showed that the inflammatory response network pathway was highly associated with the differentially regulated proteins and genes observed and selected in the study. Sixteen genes/proteins which were found to be differentially regulated in the NOPS samples on SARS-CoV-2 infection were associated with the inflammatory response pathway (Fig 3A). Cellular movement/cell to cell signalling and cellular assembly/organization were the two major canonical pathways associated with the differentially regulated proteome of infected NOPS (Fig 3B). The major functions associated with the identified and differentially regulated proteome include cell to cell signalling and interaction, immune cell trafficking, cellular movement, inflammatory response and hypersensitive response (Fig 4).

Cytokine array and western blot analysis
Cytokine array analysis showed that almost all the proteins representing the cytokine pathway [16] were up regulated in the infected NOPS samples relative to the control NOPS samples (Fig 5). The majority of the interleukins, VEGF, CNTF and CINC-2 were strongly up regulated on COVID infection (Fig 5). Western blot analysis showed that ANXA1, ANXA2 and ANXA5 were up regulated by several fold in SARS-CoV-2 positive NOPS samples relative to control NOPS samples (Fig 6). ANXA2 showed the strongest up regulation in the infected samples.

Discussion
The initial site of SARS-CoV-2 infection is the upper airways. As the disease progresses, respiratory difficulties have been observed in severe cases [2]. The expression of two main

PLOS ONE
SARS-CoV-2 receptors, angiotensin converting enzyme-2 (ACE2) and transmembrane protease serine 2 (TMPRSS2), was found to be high in the upper airway tract mucosa [11]. The nasopharyngeal region plays a major role in mucosal immunity against SARS-CoV-2 infection by activating lymphocytes, B cells, and other cellular components, as well as activating antigen-specific immunity [4]. The oral cavity was also reported to be implicated in various immunological responses to SARS-CoV-2 infection [11]. In fact, the test for viral infection uses swabs from this region, which are subjected to RTPCR. The swabs would contain epithelial and immune cells [15,17]. We have shown that when cells from this region were analyzed for the expression of host-defense peptides, the defensins, they were significantly down-regulated [12]. In this paper, we describe the effect of SARS-CoV-2 infection in patients on the expression of cytokine genes and the proteomic profile of cells from the nasopharyngeal region using high throughput comparative proteomic analysis. We have analyzed pooled samples from individuals who tested COVID19 positive by RTPCR and individuals who were COVID19 negative. Samples were obtained from individuals during the early days of the pandemic in 2020. The major viral proteins were detected in the infected nasopharyngeal samples. They included SARS-CoV-2 non-structural proteins that are responsible for viral transcription, replication, proteolytic processing, suppression of host immune responses and suppression of host gene expression (Table 1). The nucleocapsid protein is an RNA-binding protein that is essential for viral assembly into a ribonucleoprotein complex and also functions in viral budding.
The host proteins found to be differentially regulated on SARS-CoV-2 infection are involved in various aspects of the immune response, particularly neutrophil activation. Our proteomic profile shows that HIST1H3A, H2AFZ, HIST1H4A, GSTA1, DPYSL5, S100A9, and S100A8 proteins were up regulated by more than 4 log fold change. Several neutrophil activated proteins and DNA binding proteins such as histones have a significant role in inflammation (Figs 1 and 2). Histones are a prominent component of neutrophil extracellular traps (NETs), which cause inflammation and thrombosis [18]. Expression of the neutrophil activated proteins S100A8/A9 (calprotectin) [19], DPYSL5 [20], and neutrophil elastase (ELANE) has been linked to COVID-19 infection and severity through the formation of NETs, which cause acute lung injury [21]. Anti-neutrophil therapy is therefore worth exploring, as neutrophil activation could be a potential therapeutic target for viral infections.
In SARS-CoV-2 infection, reactive oxygen species (ROS) cause oxidative stress in host cells and stimulate the production of cytokines, antioxidants, transcription factors, dendritic cells, and macrophages [22]. Our findings (Table 2) show up regulation of the detoxifying enzyme Glutathione S transferase (GSTA1) [22], the antioxidant Peroxiredoxin-1 (PRDX1) [22], and the antibacterial enzyme Lysozyme C (LYZ) [4,23] during SARS-CoV-2 infection, indicating a significant oxidative defense mechanism. The severity of COVID-19 disease has been linked to elevated levels of ROS production [24]. Our study therefore suggests that further research on ROS-mediated markers may help in identifying infection severity at an early stage.
Several other proteins described in Table 2 are also associated with inflammatory responses and apoptosis. It is interesting to see the up regulation of annexins (Figs 2 and 6) (Table 2). Annexin family proteins are calcium dependent phospholipid binding proteins. Annexin A1 (ANXA1), Annexin A2 (ANXA2) and Annexin A5 (ANXA5) are predominantly found in SARS-CoV-2 infected host cells and are involved in the pro-inflammatory response and thrombosis [25,26]. Recent reports indicate that auto-antibodies to Annexin A2 are detected in COVID19 patients, suggesting that it plays an important role in pathophysiology [14,26].
Our proteomic findings are also consistent with in vitro studies that showed SARS-CoV-2 infection activated host immune proteins. Cytokines are a major component of the innate immune response and are produced by lymphocytes and granulocytes [27][28][29]. During an inflammatory reaction, host cell pathogen-recognition receptors (PRRs) trigger cytokine production in the host system [27][28][29][30]. Cytokines are divided into two types: pro-inflammatory cytokines, which are released in excess in response to infection, and anti-inflammatory cytokines, which regulate pro-inflammatory cytokines [27][28][29]. Pro-inflammatory cytokines such as IL-1, IL-10, IL-6, and TNFα were found in significant concentrations in COVID-19 individuals, causing a cytokine storm during SARS-CoV-2 infection, which has been linked to ARDS, multi-organ failure, and increased candidate risk [28,31]. Tumor necrosis factor α (TNF-α) regulates leukocyte trafficking by stimulating cell adhesion molecules such as ICAM-1, VCAM-1, and selectins, and it has been linked to a wide range of immunological disorders [32]. Our study found that pro-inflammatory cytokines such as IL-1, IL-6, IL-10, and TNFα, as well as other cytokine proteins, were up regulated (Fig 5). At the gene expression level, IL-6 was found to be significantly up regulated, with more than 1 log fold change (Fig 2). Despite the lack of disease severity data, our findings are consistent with previous research [4] and suggest that IL-6 could be used as a marker for COVID-19 disease severity. Some proteins, such as Deleted in malignant brain tumours 1 (DMBT1) and Lactotransferrin (LTF), bind to growth factors and inhibit IL-6 production [33,34], suggesting that they could be antiviral. Our study indicates that a preliminary screen of host cells using global proteomics provides a detailed view of viral effects and may point to potential therapeutics.
Although earlier corona virus infections manifested in a mild form [1,2], the pandemic due to the corona virus SARS-CoV-2 that started in 2019 had a devastating effect on health services globally. Despite extensive efforts, there is no drug that specifically treats COVID19. There has been extensive research on various aspects of the virology of SARS-CoV-2, not only to understand the disease, but also to devise effective treatment. Development of vaccines at warp speed has been effective in controlling the disease [35], but data on long term protection are not available. The symptoms are highly variable among infected individuals and also appear to depend on the infecting strain. Responses by various populations may differ considerably. Hence, it is important to investigate protein and gene expression in patients from different geographical locations and ethnicities. It is important to examine various biochemical parameters at the onset of infection, particularly at the initial site of infection and during the early days of the pandemic, to understand various aspects of the disease. By a combination of proteomics analysis, gene expression studies and the cytokine blot array, we were able to detect several proteins related to inflammation and auto immunity, such as histones and annexins, as well as up regulated cytokines. Clearly a multi-dimensional approach is necessary to obtain a complete picture of SARS-CoV-2 infection. Such an approach would facilitate therapeutic interventions.

Sample collection
Human nasopharyngeal/oropharyngeal swab samples in viral transport media (VTM) were obtained from the archive of the CSIR-Centre for Cellular and Molecular Biology (CCMB) COVID19 diagnostic facility [12]. A total of 80 VTM swab samples (collected from Government hospital, Hyderabad, Telangana, India during the early phase of COVID infection in 2020) were selected for genomics analysis. A further 12 VTM swab samples were selected for quantitative differential proteomics analysis. The samples were inactivated in their respective lysis buffer. The study was carried out with CCMB institutional biosafety and institutional ethical approval (82/2020).

Protein extraction and iTRAQ labelling
A total of 12 VTM swab samples were taken for the quantitative differential proteomics study and grouped into 2, with each group consisting of 3 positive and 3 negative SARS-CoV-2 VTM samples. Total protein was extracted from the pooled SARS-CoV-2 VTM samples at the CCMB BSL3 lab facility. The pooled samples were centrifuged at high RPM, and the resulting pellet was washed with 1% PBS, resuspended in protein solubilization buffer (7 M urea, 2 M thiourea, 18 mM Tris-HCl, 4% CHAPS, 14 mM Trizma base, 0.2% Triton X, 50 mM DTT and EDTA free protease inhibitor cocktail mix) and sonicated for 10 minutes. After brief centrifugation, the supernatant was collected and quantified using the amido black method [36]. A total of 200 μg of protein was in-gel trypsin digested and labelled with iTRAQ labels as per a previously described protocol [36][37][38][39][40]. SARS-CoV-2 negative samples were labelled with iTRAQ 114 and 115, and positive samples were labelled with 116 and 117. All the labelled peptides were pooled and analysed by liquid chromatography tandem mass spectrometry (LC-MS/MS) in a Q-Exactive HF mass spectrometer coupled to an EASY-nLC 1200 system. Peptides were eluted through a PepMap™ RSLC C18 column (Thermo Scientific) with a 150 min gradient. The electrospray voltage was set to 2.2 kV and the capillary temperature was set to 310 °C. The mass spectrometer was operated in Top 10 data-dependent MS2 (dd-MS2) and data-dependent selected ion monitoring (dd-SIM) acquisition modes. The raw data were analyzed using Sequest HT in Proteome Discoverer 1.4 (Thermo Scientific), with a 1% FDR (Percolator) and XCorr (score vs charge). The parameters are summarized in Supplementary Table 1. The obtained data were analyzed against the human proteome and SARS-CoV-2 proteome databases separately. The identified SARS-CoV-2 and host proteins were listed. Differential expression analysis was carried out for the obtained host proteins against control SARS-CoV-2 negative samples. The proteomic data obtained from the study were submitted to the PRIDE database.
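The differential expression step can be illustrated with a minimal sketch. It assumes, hypothetically, that each host protein has one summed reporter-ion intensity per iTRAQ channel, that "log fold" refers to log2, and that the one log fold cut-off used in the Results is applied symmetrically; the function names and the intensities are illustrative and are not taken from the Proteome Discoverer output.

```python
import math

# Channel layout as described in the text: 114/115 negative, 116/117 positive.
CONTROL_CHANNELS = ("114", "115")
INFECTED_CHANNELS = ("116", "117")


def log2_fold_change(reporter, pseudocount=0.0):
    """Return log2(mean infected / mean control) for one protein.

    `reporter` maps channel name to reporter-ion intensity.
    `pseudocount` is a hypothetical guard against zero intensities.
    """
    control = sum(reporter[c] for c in CONTROL_CHANNELS) / len(CONTROL_CHANNELS)
    infected = sum(reporter[c] for c in INFECTED_CHANNELS) / len(INFECTED_CHANNELS)
    return math.log2((infected + pseudocount) / (control + pseudocount))


def classify(lfc, threshold=1.0):
    """Label a protein using an assumed symmetric log2 cut-off."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Illustrative (made-up) intensities for a single protein.
example = {"114": 100.0, "115": 100.0, "116": 400.0, "117": 400.0}
lfc = log2_fold_change(example)
print(lfc, classify(lfc))
```

Averaging the two channels per condition before taking the ratio is one simple choice; per-channel normalization or a statistical test across replicates would be equally reasonable and may be closer to what the software reports.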

Real-time PCR analysis
Total RNA was extracted from the selected individual VTM swab samples using the KingFisher™ Flex System (Thermo Fisher Scientific Inc., USA). A total of 80 VTM RNA samples were quantified and grouped into 8, with each group consisting of 100 ng of 5 positive and 5 negative samples. cDNA was synthesized (BioRad, USA) from 200 ng of each VTM group. Real-time PCR analysis was performed for the selected list of genes using the Applied Biosystems ViiA™ 7 Real-Time PCR System (USA) with gene-specific primers (Table 3). All primers were designed using Primer3 software, and the GAPDH gene was used for data normalization. The TB Green Premix Ex Taq II (Tli RNaseH Plus) kit (Takara, Japan) was used, with the respective qRTPCR conditions (annealing temperature 55 °C or 60 °C) and melt curve analysis following a previously described protocol [12]. Differential expression analysis was performed using the obtained cycle threshold values.
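One common way to turn cycle threshold (Ct) values normalized to a reference gene into relative expression is the 2^(-ΔΔCt) calculation. The sketch below assumes that this calculation was used, that amplification efficiency is close to 100%, and that each pooled group yields one Ct for the target gene and one for GAPDH; the function name and the Ct values are hypothetical.

```python
import math
from statistics import mean


def relative_expression(infected, control):
    """Compute ddCt, fold change and log2 fold change.

    `infected` and `control` are lists of (target_ct, gapdh_ct) pairs,
    one pair per pooled group. Assumes ~100% PCR efficiency.
    """
    # dCt: target Ct normalized to the GAPDH reference within each group.
    dct_infected = mean(target - ref for target, ref in infected)
    dct_control = mean(target - ref for target, ref in control)
    # ddCt: infected relative to control; negative means higher expression.
    ddct = dct_infected - dct_control
    fold = 2.0 ** (-ddct)
    return ddct, fold, math.log2(fold)


# Illustrative (made-up) Ct values for one gene.
infected = [(24.0, 18.0), (25.0, 19.0)]
control = [(27.0, 18.0), (28.0, 19.0)]
print(relative_expression(infected, control))
```

A lower Ct means more template, so a negative ΔΔCt corresponds to up regulation in the infected groups.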

Cytokine array and western blot analysis
Cytokine expression analysis was performed using the Cytokine Array-RAT Cytokine antibody array (Abcam, USA) kit. A total of 300 μg of protein from two groups consisting of 6 positive and 6 negative samples was immunoblotted in the kit as per the manufacturer's protocol. The obtained spot patterns were densitometrically analyzed using ImageJ software to estimate the expression level of cytokines in SARS-CoV-2 infection.
Protein expression of ANXA1, ANXA2 and ANXA5 was analyzed by western blot. A total of 25 μg of total protein was electrophoresed on 10% SDS-PAGE and immunoblotted against the respective antibodies [41]. GAPDH was used as a housekeeping protein.
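The densitometric comparison can be sketched as a normalization of each band or spot intensity to the GAPDH signal from the same sample, followed by a ratio of infected to control. This is an assumed, simplified calculation; the function name and the intensity values are hypothetical and do not come from the ImageJ measurements.

```python
def normalized_fold_change(target_infected, gapdh_infected,
                           target_control, gapdh_control):
    """Return infected/control fold change after GAPDH normalization.

    All arguments are background-subtracted densitometric intensities
    (arbitrary units), for example as measured in ImageJ.
    """
    if gapdh_infected <= 0 or gapdh_control <= 0 or target_control <= 0:
        raise ValueError("intensities used as denominators must be positive")
    infected_ratio = target_infected / gapdh_infected
    control_ratio = target_control / gapdh_control
    return infected_ratio / control_ratio


# Illustrative (made-up) band intensities for one protein.
print(normalized_fold_change(300.0, 100.0, 100.0, 100.0))
```

Normalizing to the housekeeping protein first corrects for unequal loading between lanes before the two conditions are compared.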

Pathway analysis
The proteins and genes selected for the study were subjected to network and pathway analysis using Ingenuity Pathway Analysis software. The network pathways, regulator effects, and diseases and functions associated with the genes/proteins were mapped.